22 research outputs found
Recommended from our members
How Do Astronomers Share Data? Reliability and Persistence of Datasets Linked in AAS Publications and a Qualitative Study of Data Practices among US Astronomers
We analyze data sharing practices of astronomers over the past fifteen years. An analysis of URL links embedded in papers published by the American Astronomical Society reveals that the total number of links included in the literature rose dramatically from 1997 until 2005, when it leveled off at around 1500 per year. The analysis also shows that the availability of linked material decays with time: in 2011, 44% of links published a decade earlier, in 2001, were broken. A rough analysis of link types reveals that links to data hosted on astronomers' personal websites become unreachable much faster than links to datasets on curated institutional sites. To gauge astronomers' current data sharing practices and preferences further, we performed in-depth interviews with 12 scientists and online surveys with 173 scientists, all at a large astrophysical research institute in the United States: the Harvard-Smithsonian Center for Astrophysics, in Cambridge, MA. Both the in-depth interviews and the online survey indicate that, in principle, there is no philosophical objection to data sharing among astronomers at this institution. Key reasons that more data are not presently shared more efficiently in astronomy include: the difficulty of sharing large data sets; overreliance on non-robust, non-reproducible mechanisms for sharing data (e.g., emailing it); unfamiliarity with options that make data sharing easier (faster) and/or more robust; and, lastly, a sense that other researchers would not want the data to be shared. We conclude with a short discussion of a new effort to implement an easy-to-use, robust system for data sharing in astronomy, at theastrodata.org, and we analyze the uptake of that system to date.
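The link-decay measurement described in this abstract can be illustrated with a minimal sketch. It assumes each link has already been checked and reduced to a (publication year, reachable) pair; the function name `broken_fraction_by_year` and the sample records are hypothetical and are not taken from the study or its data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def broken_fraction_by_year(records: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Return {publication_year: fraction of links found broken}.

    Each record is (publication_year, reachable). How `reachable` was
    determined (e.g. an HTTP request) is outside this sketch.
    """
    total = defaultdict(int)
    broken = defaultdict(int)
    for year, reachable in records:
        total[year] += 1
        if not reachable:
            broken[year] += 1
    return {year: broken[year] / total[year] for year in sorted(total)}


# Made-up illustrative records, NOT data from the study.
sample = [
    (2001, False), (2001, True), (2001, False), (2001, True),
    (2005, True), (2005, True), (2005, False), (2005, True),
]
rates = broken_fraction_by_year(sample)
```

Grouping by publication year is what lets a statement such as "44% of links published in 2001 were broken in 2011" be read directly off the result; a real analysis would also need to record when each check was run.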
Sharing and Preserving Computational Analyses for Posterity with encapsulator
Open data and open-source software may be part of the solution to science's
"reproducibility crisis", but they are insufficient to guarantee
reproducibility. Requiring minimal end-user expertise, encapsulator creates a
"time capsule" with reproducible code in a self-contained computational
environment. encapsulator provides end-users with a fully-featured desktop
environment for reproducible research. Comment: 11 pages, 6 figures
If these data could talk
In the last few decades, data-driven methods have come to dominate many fields of scientific inquiry. Open data and open-source software have enabled the rapid implementation of novel methods to manage and analyze the growing flood of data. However, it has become apparent that many scientific fields exhibit distressingly low rates of repeatability and reproducibility. Although there are many dimensions to this issue, we believe that there is a lack of formalism used when describing end-to-end published results, from the data source to the analysis to the final published results. Even when authors do their best to make their research and data accessible, this lack of formalism reduces the clarity and efficiency of reporting, which contributes to issues of reproducibility. Data provenance aids both repeatability and reproducibility through systematic and formal records of the relationships among data sources, processes, datasets, publications and researchers.
Engineering and Applied Sciences; Organismic and Evolutionary Biology
Cosmic rays in active galactic nuclei
This work explores the connection between cosmic rays and light element production in an active galaxy environment.
Cosmic rays generated in an active galactic nucleus (AGN) interact with the local, line-emitting gas and spall the light elements Li, Be and B. Careful consideration of the propagation of cosmic rays from AGNs to Earth yields a variety of models that are consistent with the observed cosmic ray spectrum. However, by using observed upper limits for B III 2066 Å line emission from AGNs, we are able to rule out certain cosmic ray flux models. This analysis requires a detailed study of boron ionization balance under typical AGN conditions, a study that is carried out here for the first time. Models with a total cosmic ray luminosity $L_{CR}=10^{45}$ erg s$^{-1}$ and a diffusion coefficient in the line emission region of $D\le 10^{28}$ cm$^{2}$ s$^{-1}$, and those with $L_{CR}=10^{45}$ erg s$^{-1}$ and $D\le 3\times 10^{26}$ cm$^{2}$ s$^{-1}$ do not satisfy the spectroscopic constraints. However, models with lower cosmic ray luminosities or larger diffusion coefficients are acceptable.
The results of spallation in AGNs are also applied to our Galaxy, under the assumption that it has passed through an active phase. An additional source of light elements during this active phase can reproduce the B and Be abundances observed in the halo, and contribute partially to the light element abundances observed in the disk.
Connecting Data Repositories with the Research Life Cycle
Since the Dataverse Project, an open source data publishing framework developed at Harvard's Institute for Quantitative Social Science, released its SWORD API for data deposit in 2013, several stakeholders have developed integrations specifically into the Harvard Dataverse (https://dataverse.harvard.edu/) and other Dataverse-based repositories with various automated workflows throughout the research life cycle. This poster will demonstrate the technology necessary for interoperability between different systems with Dataverse, and highlight a number of these automated use cases, which can occur at different times during the research life cycle. Some examples include: researchers using R to deposit data and scripts into Dataverse (Dataverse R package on CRAN) or to archive data from a research project (OSF Dataverse Add-On); authors submitting data for a journal article (OJS and ScholarOne); and preserving research data for the long term using Archivematica.
Automating Open Science for Big Data
The vast majority of social science research presently uses small (MB or GB scale) data sets. These fixed-scale sets are commonly downloaded to the researcher's computer, where the analysis is performed locally, and are often shared and cited with well-established technologies, such as the Dataverse Project (see Dataverse.org), to support the published results. The trend towards Big Data, including large-scale streaming data, is starting to transform research and has the potential to impact policy-making and our understanding of the social, economic, and political problems that affect human societies. However, this research poses new challenges in execution, accountability, preservation, reuse, and reproducibility. Downloading these data sets to a researcher's computer is infeasible or impractical; hence, analyses take place in the cloud, require unusual expertise, and benefit from collaborative teamwork and novel tool development. The very informativeness that makes these data sets valuable also means that they are much more likely to contain highly sensitive personally identifiable information. In this paper, we discuss solutions to these new challenges so that the social sciences can realize the potential of Big Data.
Engineering and Applied Sciences
Data for: How Do Astronomers Share Data? Reliability and Persistence of Datasets Linked in AAS Publications and a Qualitative Study of Data Practices among US Astronomers
A corpus of articles published between 1997 and 2008 in the four main astronomy journals (The Astrophysical Journal, The Astrophysical Journal Letters, Astronomy & Astrophysics, The Astronomical Journal) which contain external URL links in their full text.